Fast protein fragment similarity scoring using a Binet-Cauchy kernel
Abstract
MOTIVATION: Meaningful scores to assess protein structure similarity are essential to decipher protein structure and sequence evolution. Mining the increasing number of protein structures requires fast and accurate similarity measures with statistical significance. Whereas numerous approaches have been proposed for protein domains as a whole, the focus is progressively moving to a more local level of structure analysis, for which similarity measurement still lacks a satisfactory answer.
RESULTS: We introduce a new score based on the Binet-Cauchy kernel. It is normalized and bounded between 1 (maximal similarity, which implies exactly the same conformations for the protein fragments) and -1 (mirror image conformations), with unrelated conformations having a null mean score. This allows searching for both similar and mirror conformations. In addition, the score addresses two major issues of the widely used root mean square deviation (RMSD). First, it achieves length-independent statistics even for short fragments. Second, it shows better performance in discriminating medium-range RMSD values. Being simpler and faster to compute than the RMSD, it also provides the means for large-scale mining of protein structures.
AVAILABILITY AND IMPLEMENTATION: The computer software implementing the score is available at http://bioserv.rpbs.univ-paris-diderot.fr/BCscore/
CONTACT: [email protected]
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
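The abstract does not state the formula, so the following is only a minimal sketch under an assumption: that the score is the normalized determinant det(X^T Y) / sqrt(det(X^T X) det(Y^T Y)) computed on centered n x 3 coordinate matrices X and Y of two equal-length fragments. The function name `bc_score` and the input conventions are hypothetical and are not taken from the BCscore software.

```python
import numpy as np


def bc_score(x, y):
    """Normalized Binet-Cauchy style score between two fragments.

    ASSUMPTION: the score is det(X^T Y) / sqrt(det(X^T X) * det(Y^T Y))
    on centered coordinates; this form is not given in the abstract.
    x, y: arrays of shape (n, 3), e.g. C-alpha coordinates, same n.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 2 or x.shape[1] != 3 or x.shape != y.shape:
        raise ValueError("expected two arrays of identical shape (n, 3)")

    # Center each fragment on its own centroid (removes translation).
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    # 3x3 Gram determinants; they vanish for flat or collinear fragments.
    gx = np.linalg.det(xc.T @ xc)
    gy = np.linalg.det(yc.T @ yc)
    if gx <= 0.0 or gy <= 0.0:
        raise ValueError("degenerate fragment: need >= 4 non-coplanar points")

    # The sign of the cross determinant separates similar from mirror shapes.
    return np.linalg.det(xc.T @ yc) / np.sqrt(gx * gy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frag = rng.normal(size=(8, 3))                 # toy 8-residue fragment
    mirror = frag * np.array([1.0, 1.0, -1.0])     # reflection through z = 0
    print(bc_score(frag, frag), bc_score(frag, mirror))
```

Under this assumed form, a rotated and translated copy of a fragment scores 1 and its mirror image scores -1, which matches the bounds described in the abstract. No optimal superposition step is needed, only three 3 x 3 determinants, which is consistent with the claim that the score is simpler to compute than the RMSD.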
Similar resources
BCSearch: fast structural fragment mining over large collections of protein structures
Resources to mine the large amount of protein structures available today are necessary to better understand how amino acid variations are compatible with conformation preservation, to assist protein design, engineering and, further, the development of biologic therapeutic compounds. BCSearch is a versatile service to efficiently mine large collections of protein structures. It relies on a new a...
Binet-Cauchy Kernels
We propose a family of kernels based on the Binet-Cauchy theorem and its extension to Fredholm operators. This includes as special cases all currently known kernels derived from the behavioral framework, diffusion processes, marginalized kernels, kernels on graphs, and the kernels on sets arising from the subspace angle approach. Many of these kernels can be seen as the extrema of a new continu...
A Nonlinear Scoring Framework for Peptide Identification via Tandem Mass Spectrometry
The problem of false positives in peptide identification via tandem mass spectrometry (MS/MS) by database searching remains unsatisfactorily resolved in current proteomics research. The correlative information among fragment ions in the MS/MS spectrum can be very helpful for reducing the number of false positives. However, due to the computational difficulty, existing peptide-scoring algori...
Developing optimal non-linear scoring function for protein design
MOTIVATION: Protein design aims to identify sequences compatible with a given protein fold but incompatible with any alternative folds. To select the correct sequences and to guide the search process, a design scoring function is critically important. Such a scoring function should be able to characterize the global fitness landscape of many proteins simultaneously. RESULTS: To find o...
The solution of the Binet-Cauchy functional equation for square matrices
Heuvers, K.J. and D.S. Moak, The solution of the Binet-Cauchy functional equation for square matrices, Discrete Mathematics 88 (1991) 21-32. It is shown that if $f : M_n(K) \to K$ is a nonconstant solution of the Binet-Cauchy functional equation for $A, B \in M_n(K)$ and if $f(E) = 0$, where $E$ is the $n \times n$ matrix with all entries $1/n$, then $f$ is given by $f(A) = m(\det A)$, where $m$ is a multiplicative function on...
Journal title:
- Bioinformatics
Volume 30, Issue 6
Publication date: 2014